MAXIMUM LIKELIHOOD DEGREE OF FERMAT HYPERSURFACES

VIA EULER CHARACTERISTICS 


BOTONG WANG 

Abstract. The maximum likelihood degree of a projective variety is the number of critical
points of a general likelihood function. In this note, we compute the maximum likelihood
degree of Fermat hypersurfaces. We give a formula for the maximum likelihood degree in
terms of the constants $\beta_{\mu,\nu}$, defined to be the number of complex solutions to
the system of equations $z_1^\nu = z_2^\nu = \cdots = z_\mu^\nu = 1$ and
$z_1 + \cdots + z_\mu + 1 = 0$.


1. INTRODUCTION 

Maximum likelihood estimation is a fundamental problem in statistics. The maximum
likelihood degree is the number of potential solutions to the maximum likelihood
estimation problem on a projective variety. When the variety is smooth, Huh [H] showed
that the maximum likelihood degree is a topological invariant. If the variety is
a general complete intersection, the maximum likelihood degree is computed in [CHKS]
(see also [HS]).

In a recent preprint [AAGL], Agostini, Alberelli, Grande and Lella studied the maximum
likelihood degree of Fermat hypersurfaces. They obtained formulas for the maximum
likelihood degree of a few special families of Fermat surfaces. However, their
approach is through a case-by-case study.

In this note, we compute the maximum likelihood degree of Fermat hypersurfaces in a more
systematic way, via a topological method. In general, the formula given in [CHKS] does
not apply to all Fermat hypersurfaces, because the intersection of hypersurfaces

$$\{x_0^d + x_1^d + \cdots + x_n^d = 0\} \cap \{x_0 + x_1 + \cdots + x_n = 0\} \subset \mathbb{P}^n$$

may not be transverse. We will compute the error terms introduced by the non-transverse
intersections. The main ingredient is Milnor's result on the topology of isolated
hypersurface singularities. This topological approach is closely related to the approach
of [BW] and [RW]. In fact, for an isolated hypersurface singularity, the Euler obstruction
is, up to a sign, equal to the Milnor number plus one. So we essentially apply the ideas
of [BW] and [RW] to these particular examples.

First, let us recall the definition of the maximum likelihood degree. Let $\mathbb{P}^n$
be the $n$-dimensional complex projective space with homogeneous coordinates
$(x_0, x_1, \ldots, x_n)$. Denote the coordinate hyperplane $\{x_i = 0\} \subset \mathbb{P}^n$
by $H_i$, and the hyperplane $\{x_0 + x_1 + \cdots + x_n = 0\}$ by $H_+$. Let the index set
be $\Lambda = \{0, 1, \ldots, n, +\}$, and let $\mathcal{H} = \bigcup_{\lambda \in \Lambda} H_\lambda$.
Let $X \subset \mathbb{P}^n$ be a complex projective variety. Denote the smooth locus of $X$
by $X_{\mathrm{reg}}$. The maximum likelihood degree of $X$ is defined to be the number of
critical points of the likelihood function

$$\ell_u = \frac{x_0^{u_0} x_1^{u_1} \cdots x_n^{u_n}}{(x_0 + x_1 + \cdots + x_n)^{u_0 + u_1 + \cdots + u_n}}$$

on $X_{\mathrm{reg}} \setminus \mathcal{H}$ for generic $(u_i)_{0 \le i \le n} \in \mathbb{Z}^{n+1}$.

Theorem 1.1. Denote the Fermat hypersurface
$\{x_0^d + x_1^d + \cdots + x_n^d = 0\} \subset \mathbb{P}^n$ by $F_{n,d}$, and denote its
maximum likelihood degree by $\mathrm{MLdeg}(F_{n,d})$. Then,

(1)  $\mathrm{MLdeg}(F_{n,d}) = d + d^2 + \cdots + d^n - \sum_{0 \le j \le n-1} \binom{n+1}{j} \beta_{n-j,\,d-1}$

where $\beta_{\mu,\nu}$ is the number of complex solutions of the system of equations

$$z_1^\nu = z_2^\nu = \cdots = z_\mu^\nu = 1, \qquad z_1 + \cdots + z_\mu + 1 = 0.$$
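For small parameters the constants $\beta_{\mu,\nu}$ can be counted directly from this
definition. The following Python sketch is only an illustration and not part of the
argument: it enumerates all $\nu^\mu$ tuples of $\nu$-th roots of unity and tests the
linear equation in floating point. The function name `beta` and the tolerance `tol` are
our own choices, and the floating-point test is assumed to be reliable only for small
$\mu$ and $\nu$.

```python
import cmath
from itertools import product


def beta(mu: int, nu: int, tol: float = 1e-9) -> int:
    """Count tuples (z_1, ..., z_mu) of nu-th roots of unity with z_1 + ... + z_mu + 1 = 0."""
    roots = [cmath.exp(2j * cmath.pi * k / nu) for k in range(nu)]
    # Brute-force enumeration of all nu**mu tuples; feasible only for small mu and nu.
    return sum(1 for zs in product(roots, repeat=mu) if abs(sum(zs) + 1) < tol)


if __name__ == "__main__":
    for mu, nu in [(1, 2), (1, 3), (2, 3), (2, 6), (3, 2)]:
        print(f"beta({mu},{nu}) = {beta(mu, nu)}")
```

The cost grows like $\nu^\mu$, so this is useful only for checking small cases.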

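When $\mu$ or $\nu$ is small, $\beta_{\mu,\nu}$ can be easily calculated. For example,

(2)  $\beta_{\mu,1} = 0$,

(3)  $\beta_{1,\nu} = \begin{cases} 0 & \text{if } \nu \text{ is odd}, \\ 1 & \text{if } \nu \text{ is even}, \end{cases}$

(4)  $\beta_{2,\nu} = \begin{cases} 2 & \text{if } \nu \text{ is divisible by } 3, \\ 0 & \text{otherwise}. \end{cases}$

With these calculations, we recover all the closed formulas in [AAGL].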


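Corollary 1.2.

(5)  $\mathrm{MLdeg}(F_{n,2}) = 2^{n+1} - 2$,

(6)  $\mathrm{MLdeg}(F_{2,d}) = \begin{cases} d^2 + d & \text{if } d \equiv 0, 2 \bmod 6, \\ d^2 + d - 3 & \text{if } d \equiv 3, 5 \bmod 6, \\ d^2 + d - 2 & \text{if } d \equiv 4 \bmod 6, \\ d^2 + d - 5 & \text{if } d \equiv 1 \bmod 6. \end{cases}$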
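The passage from Theorem 1.1 to Corollary 1.2 is pure bookkeeping, and it can be
checked mechanically. The sketch below is an illustration with hypothetical function
names: it evaluates formula (1) using only the small-case values (2), (3) and (4), and
compares the result with the closed forms (5) and (6).

```python
from math import comb


def beta_small(mu: int, nu: int) -> int:
    """Values of beta_{mu,nu} taken from (2), (3), (4); other cases are not covered."""
    if nu == 1:
        return 0                          # (2)
    if mu == 1:
        return 1 if nu % 2 == 0 else 0    # (3)
    if mu == 2:
        return 2 if nu % 3 == 0 else 0    # (4)
    raise ValueError("only nu = 1 or mu <= 2 is covered by (2)-(4)")


def mldeg_via_theorem(n: int, d: int) -> int:
    """Right-hand side of formula (1) in Theorem 1.1."""
    power_sum = sum(d**k for k in range(1, n + 1))
    correction = sum(comb(n + 1, j) * beta_small(n - j, d - 1) for j in range(n))
    return power_sum - correction


def mldeg_surface_closed_form(d: int) -> int:
    """Closed form (6) for MLdeg(F_{2,d})."""
    offset = {0: 0, 2: 0, 3: 3, 5: 3, 4: 2, 1: 5}[d % 6]
    return d * d + d - offset


if __name__ == "__main__":
    for d in range(2, 14):
        print(d, mldeg_via_theorem(2, d), mldeg_surface_closed_form(d))
```

For $n = 2$ formula (1) reads $d + d^2 - \beta_{2,d-1} - 3\beta_{1,d-1}$, which is why
only the residue of $d$ modulo 6 matters in (6).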


When $\nu$ is a power of a prime number, we have formulas to compute $\beta_{\mu,\nu}$.
Consequently, when $d - 1$ is a power of a prime number, we have closed formulas for
$\mathrm{MLdeg}(F_{n,d})$. In fact, by a straightforward computation one can deduce the
following corollary from Theorem 1.1 and Proposition 4.2.


Corollary 1.3. Suppose $d - 1 = p^r$, where $p$ is a prime number and $r$ is a positive
integer. Then

$$\mathrm{MLdeg}(F_{n,d}) = d + d^2 + \cdots + d^n - \frac{1}{d-1} \sum \frac{(n+1)!}{(n + 1 - p(s_1 + \cdots + s_k))! \cdot ((s_1)! \cdots (s_k)!)^p}$$

where $k = \frac{d-1}{p}$ and the sum is over all nonnegative integers $s_1, \ldots, s_k$
with $1 \le s_1 + \cdots + s_k \le \frac{n+1}{p}$.
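To make the index ranges in Corollary 1.3 concrete, here is a short Python sketch
that evaluates the formula with exact integer arithmetic. It assumes the normalizing
factor $\frac{1}{d-1}$ and the range $1 \le s_1 + \cdots + s_k \le \frac{n+1}{p}$ as
stated above; the helper names `compositions` and `mldeg_prime_power` are hypothetical.

```python
from math import factorial


def compositions(total: int, parts: int):
    """Yield all tuples of `parts` nonnegative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def mldeg_prime_power(n: int, p: int, r: int) -> int:
    """Evaluate the formula of Corollary 1.3 for d = p**r + 1 (p assumed prime)."""
    d = p**r + 1
    k = p**(r - 1)                      # k = (d - 1) / p
    total = 0
    for m in range(1, (n + 1) // p + 1):  # m = s_1 + ... + s_k
        for s in compositions(m, k):
            denom = factorial(n + 1 - p * m)
            for sj in s:
                denom *= factorial(sj) ** p
            total += factorial(n + 1) // denom   # a multinomial coefficient
    correction, remainder = divmod(total, d - 1)
    if remainder != 0:
        raise ArithmeticError("sum is not divisible by d - 1")
    return sum(d**j for j in range(1, n + 1)) - correction


if __name__ == "__main__":
    for n, p, r in [(2, 2, 1), (2, 3, 1), (2, 2, 2), (3, 2, 1)]:
        print(f"n={n}, d={p**r + 1}: MLdeg = {mldeg_prime_power(n, p, r)}")
```

For $n = 2$ the values agree with (6): for instance $d = 3$ gives $9 = d^2 + d - 3$ and
$d = 4$ gives $18 = d^2 + d - 2$.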

To find a general formula for $\beta_{\mu,\nu}$ would be a very hard question in number
theory and combinatorics. In fact, determining when $\beta_{\mu,\nu} \ne 0$ had been an
open question for a long time, and it was solved by Lam and Leung [LL] in 2000.

Since the Fermat hypersurface $F_{n,d}$ is smooth, by [H] $\mathrm{MLdeg}(F_{n,d})$ is equal
to the signed Euler characteristic $(-1)^{n-1}\chi(F_{n,d} \setminus \mathcal{H})$. In
Section 2, we will compute $\chi(F_{n,d} \setminus \mathcal{H})$, and we will postpone the
technical calculation of the Milnor numbers to Section 3. In the last section, we will
briefly discuss what we know about the constants $\beta_{\mu,\nu}$.

Acknowledgement. We thank Jiu-Kang Yu and Zhengpeng Wu for helpful discussions
about the constants $\beta_{\mu,\nu}$.


2. Computing the Euler characteristics 


By the following theorem of Huh [H], we reduce the problem of computing
$\mathrm{MLdeg}(F_{n,d})$ to computing $\chi(F_{n,d} \setminus \mathcal{H})$. Recall that in
$\mathbb{P}^n$, $\mathcal{H} = \bigcup_{\lambda \in \Lambda} H_\lambda$ is the union of all
coordinate hyperplanes and the hyperplane $H_+ = \{x_0 + x_1 + \cdots + x_n = 0\}$.

Theorem 2.1 (Huh, [H]). If $X \subset \mathbb{P}^n$ is a subvariety such that
$X \setminus \mathcal{H}$ is smooth, then

$$\mathrm{MLdeg}(X) = (-1)^{\dim(X)} \chi(X \setminus \mathcal{H}).$$


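Since the Euler characteristic is additive for algebraic varieties, by the
inclusion-exclusion principle,

(7)  $\chi(X \setminus \mathcal{H}) = \sum_{0 \le i \le n} \ \sum_{\Lambda' \subset \Lambda,\ |\Lambda'| = i} (-1)^i \chi(X \cap H_{\Lambda'})$

where $H_{\Lambda'} = \bigcap_{\lambda \in \Lambda'} H_\lambda$ (with $H_\emptyset = \mathbb{P}^n$).
Subsets $\Lambda'$ with more than $n$ elements do not contribute, since any $n + 1$ of the
hyperplanes $H_\lambda$ have empty intersection.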

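The Fermat hypersurface $F_{n,d} = \{x_0^d + x_1^d + \cdots + x_n^d = 0\}$ is invariant under
any permutation of the coordinates. Therefore, (7) can be written as

(8)  $\chi(F_{n,d} \setminus \mathcal{H}) = \sum_{0 \le i \le n} (-1)^i \left( \binom{n+1}{i} \chi(F_{n,d} \cap V^i) + \binom{n+1}{i-1} \chi(F_{n,d} \cap W^i) \right)$

where $V^i = \bigcap_{0 \le j \le i-1} H_j$ and $W^i = H_+ \cap \bigcap_{0 \le j \le i-2} H_j$
(with $W^0 = \emptyset$ and $W^1 = H_+$).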

$F_{n,d} \cap V^i$ is a smooth hypersurface in $\mathbb{P}^{n-i}$ of degree $d$. The Euler
characteristics of such hypersurfaces depend only on $n - i$ and $d$, and they are
calculated in [D, Chapter 5, (3.7)]. However, it turns out that we do not have to compute
each of these Euler characteristics. For now, we simply denote the Euler characteristic
of a smooth degree $d$ hypersurface in $\mathbb{P}^m$ by $e_{m,d}$. In particular,

(9)  $\chi(F_{n,d} \cap V^i) = e_{n-i,d}$.

$F_{n,d} \cap W^i$ is a possibly singular hypersurface in $W^i$ for $1 \le i \le n$. In fact,
$F_{n,d} \cap W^i$ is isomorphic to the intersection of the Fermat hypersurface
$F_{n-i+1,d} \subset \mathbb{P}^{n-i+1}$ and the hyperplane
$\{x_0 + x_1 + \cdots + x_{n-i+1} = 0\}$. Using the Lagrange multiplier method, one can
easily see that all the singular points of $F_{n,d} \cap W^i$ are isolated and there are
exactly $\beta_{n-i+1,\,d-1}$ many of them. The Euler characteristics of such hypersurfaces
can be computed using Milnor numbers.


Theorem 2.2. [D, Chapter 5 (4.4)] For any singular point $P$ of $F_{n,d} \cap W^i$ we can
define the Milnor number $\mu(F_{n,d} \cap W^i, P)$ by considering $F_{n,d} \cap W^i$ as a
hypersurface of $W^i$. Then,

(10)  $\chi(F_{n,d} \cap W^i) = e_{n-i,d} + (-1)^{n-i} \sum_P \mu(F_{n,d} \cap W^i, P)$

where the sum is over all the singular points $P$ of $F_{n,d} \cap W^i$.

Proposition 2.3. For any singular point $P$ of $F_{n,d} \cap W^i$,

(11)  $\mu(F_{n,d} \cap W^i, P) = 1$.

We will postpone the proof of the proposition to the next section. The next corollary
follows immediately from (10), (11) and the count $\beta_{n-i+1,\,d-1}$ of singular points.


Corollary 2.4.

(12)  $\chi(F_{n,d} \cap W^i) = e_{n-i,d} + (-1)^{n-i} \beta_{n-i+1,\,d-1}$.

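Now, combining (8), (9) and (12), we have

(13)  $\chi(F_{n,d} \setminus \mathcal{H}) = \sum_{0 \le i \le n} (-1)^i \left( \binom{n+1}{i} e_{n-i,d} + \binom{n+1}{i-1} \left( e_{n-i,d} + (-1)^{n-i} \beta_{n-i+1,\,d-1} \right) \right).$

Since $\binom{n+1}{i} + \binom{n+1}{i-1} = \binom{n+2}{i}$, (13) is equivalent to

(14)  $\chi(F_{n,d} \setminus \mathcal{H}) = \sum_{0 \le i \le n} (-1)^i \binom{n+2}{i} e_{n-i,d} + \sum_{1 \le i \le n} (-1)^n \binom{n+1}{i-1} \beta_{n-i+1,\,d-1}.$

Suppose $X$ is a general hypersurface of degree $d$ in $\mathbb{P}^n$. Then (7) implies that

(15)  $\chi(X \setminus \mathcal{H}) = \sum_{0 \le i \le n} (-1)^i \binom{n+2}{i} e_{n-i,d}.$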

The maximum likelihood degree of a general hypersurface is well understood.

Proposition 2.5. [HS, 1.11] The maximum likelihood degree of a general degree $d$
hypersurface in $\mathbb{P}^n$ is equal to $d + d^2 + \cdots + d^n$.



Combining the proposition, (15) and Theorem 2.1, we have

(16)  $\sum_{0 \le i \le n} (-1)^i \binom{n+2}{i} e_{n-i,d} = \chi(X \setminus \mathcal{H}) = (-1)^{n-1} \mathrm{MLdeg}(X) = (-1)^{n-1} (d + d^2 + \cdots + d^n).$

Therefore, (14) is equivalent to

(17)  $\chi(F_{n,d} \setminus \mathcal{H}) = (-1)^{n-1} (d + d^2 + \cdots + d^n) + \sum_{1 \le i \le n} (-1)^n \binom{n+1}{i-1} \beta_{n-i+1,\,d-1}.$

Again, by Theorem 2.1, we have

(18)  $\mathrm{MLdeg}(F_{n,d}) = d + d^2 + \cdots + d^n - \sum_{1 \le i \le n} \binom{n+1}{i-1} \beta_{n-i+1,\,d-1}$

which, after substituting $j = i - 1$, is the statement of Theorem 1.1.
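The argument above never needs the individual values $e_{m,d}$, only the alternating sum
$\sum_{0 \le i \le n} (-1)^i \binom{n+2}{i} e_{n-i,d}$. As an independent sanity check, one
can nevertheless insert the standard closed formula
$e_{m,d} = \frac{(1-d)^{m+1} - 1}{d} + m + 1$ for the Euler characteristic of a smooth
degree $d$ hypersurface in $\mathbb{P}^m$. This formula is not stated in this note and is
assumed here; the function names are hypothetical. The sketch compares the alternating
sum with $(-1)^{n-1}(d + d^2 + \cdots + d^n)$.

```python
from math import comb


def euler_smooth_hypersurface(m: int, d: int) -> int:
    """Assumed standard formula: chi = ((1 - d)**(m + 1) - 1) / d + m + 1."""
    # (1 - d)**(m + 1) is congruent to 1 modulo d, so the division is exact.
    return ((1 - d) ** (m + 1) - 1) // d + m + 1


def alternating_sum(n: int, d: int) -> int:
    """Sum over 0 <= i <= n of (-1)^i * C(n+2, i) * e_{n-i,d}."""
    return sum((-1) ** i * comb(n + 2, i) * euler_smooth_hypersurface(n - i, d)
               for i in range(n + 1))


def signed_power_sum(n: int, d: int) -> int:
    """(-1)^(n-1) * (d + d^2 + ... + d^n)."""
    return (-1) ** (n - 1) * sum(d**k for k in range(1, n + 1))


if __name__ == "__main__":
    for n in range(1, 5):
        for d in range(2, 5):
            print(n, d, alternating_sum(n, d), signed_power_sum(n, d))
```

For example, $e_{2,3} = 0$ (a smooth plane cubic) and $e_{3,2} = 4$ (a smooth quadric
surface).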


3. The Milnor numbers 


We prove Proposition 2.3 in this section. 

For the geometric meaning of the Milnor number, we refer to [D, Chapter 3]. Here we
compute the Milnor numbers using Jacobian ideals. Denote the ring of germs of
holomorphic functions at $0 \in \mathbb{C}^l$ by $\mathcal{O}$. Let $f \in \mathcal{O}$ be a
nonzero germ of a holomorphic function such that the germ of the hypersurface $f^{-1}(0)$
has an isolated singularity at the origin $0 \in \mathbb{C}^l$. The Jacobian ideal of $f$,
denoted by $J_f$, is defined by

$$J_f = \left( \frac{\partial f}{\partial z_1}, \ldots, \frac{\partial f}{\partial z_l} \right) \subset \mathcal{O}$$

where $z_1, \ldots, z_l$ are the coordinates of $\mathbb{C}^l$.


Theorem 3.1. [D, Chapter 3, (2.7)] The Milnor number of $f^{-1}(0)$ at the origin, denoted
by $\mu(f^{-1}(0), 0)$, is given by the formula

(19)  $\mu(f^{-1}(0), 0) = \dim_{\mathbb{C}} \mathcal{O} / J_f$.

Recall that $W^i = \{x_0 = x_1 = \cdots = x_{i-2} = x_0 + x_1 + \cdots + x_n = 0\} \subset \mathbb{P}^n$.
Denote $y_j = x_{j+i-1}$, $0 \le j \le n - i + 1$. Then the intersection $F_{n,d} \cap W^i$ is
isomorphic to the intersection

$$\{y_0^d + y_1^d + \cdots + y_{n-i+1}^d = 0\} \cap \{y_0 + y_1 + \cdots + y_{n-i+1} = 0\}$$

in $\mathbb{P}^{n-i+1}$. Without loss of generality, we can work on the affine chart
$y_0 \ne 0$, and rewrite the intersection in affine coordinates

$$\{1 + y_1^d + \cdots + y_{n-i+1}^d = 0\} \cap \{1 + y_1 + \cdots + y_{n-i+1} = 0\}.$$

Here, by abuse of notation, we use $y_j$ to denote the corresponding affine coordinate
$y_j / y_0$. Suppose $(\xi_1, \ldots, \xi_{n-i+1})$ is a singular point of the above
intersection. Then by the Lagrange multiplier method,

(20)  $\xi_1^{d-1} = \xi_2^{d-1} = \cdots = \xi_{n-i+1}^{d-1} = 1$.

We can eliminate $y_{n-i+1}$ by $y_{n-i+1} = -1 - y_1 - \cdots - y_{n-i}$. On this affine
chart, $F_{n,d} \cap W^i$ is isomorphic to the hypersurface $\{f = 0\}$ in
$\mathbb{C}^{n-i}$, where

(21)  $f = 1 + y_1^d + \cdots + y_{n-i}^d + (-1 - y_1 - \cdots - y_{n-i})^d$.

Let $z_j = y_j - \xi_j$. Then, since $\xi_{n-i+1} = -1 - \xi_1 - \cdots - \xi_{n-i}$,

(22)  $f = 1 + (z_1 + \xi_1)^d + \cdots + (z_{n-i} + \xi_{n-i})^d + (\xi_{n-i+1} - z_1 - \cdots - z_{n-i})^d$.

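Proposition 3.2. In the local ring $\mathcal{O}$, the Jacobian ideal
$J_f = \left( \frac{\partial f}{\partial z_1}, \ldots, \frac{\partial f}{\partial z_{n-i}} \right)$
is equal to the maximal ideal $(z_1, z_2, \ldots, z_{n-i})$.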
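Proof. Notice that $\xi_j^{d-1} = 1$ for all $1 \le j \le n - i + 1$, so that
$\xi_j^{d-2} = \xi_j^{-1}$. Differentiating (22), the constant terms cancel and

$$\frac{\partial f}{\partial z_j} = d(d-1) \left( \xi_j^{-1} z_j + \xi_{n-i+1}^{-1} (z_1 + \cdots + z_{n-i}) \right) + \text{higher degree terms}.$$

By Nakayama's lemma, we only need to show that the vectors
$v_j = \xi_j^{-1} z_j + \xi_{n-i+1}^{-1} (z_1 + \cdots + z_{n-i})$, $1 \le j \le n - i$, span the
whole vector space $\mathbb{C} z_1 \oplus \mathbb{C} z_2 \oplus \cdots \oplus \mathbb{C} z_{n-i}$.
Since $1 + \xi_1 + \cdots + \xi_{n-i+1} = 0$, we have

$$\sum_{1 \le j \le n-i} \xi_j v_j = \left( 1 + \xi_{n-i+1}^{-1} (\xi_1 + \cdots + \xi_{n-i}) \right) (z_1 + \cdots + z_{n-i}) = -\xi_{n-i+1}^{-1} (z_1 + \cdots + z_{n-i}),$$

so $z_1 + z_2 + \cdots + z_{n-i}$ is contained in their span. Thus

$$z_j = \xi_j \left( v_j - \xi_{n-i+1}^{-1} (z_1 + z_2 + \cdots + z_{n-i}) \right)$$

is in the span. □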

Now, Proposition 2.3 follows from Theorem 3.1 and Proposition 3.2. 
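Since $J_f$ is the maximal ideal exactly when the Hessian of $f$ at the singular point is
nondegenerate, Proposition 2.3 can also be tested numerically in small cases. The
following Python sketch is a floating-point illustration, not a proof. It writes
$m = n - i$, enumerates the singular points as tuples $(\xi_1, \ldots, \xi_{m+1})$ of
$(d-1)$-th roots of unity with $1 + \xi_1 + \cdots + \xi_{m+1} = 0$, and computes the
determinant of the Hessian of $f = 1 + y_1^d + \cdots + y_m^d + (-1 - y_1 - \cdots - y_m)^d$,
whose entries are $d(d-1)(\delta_{jl}\, \xi_j^{d-2} + \xi_{m+1}^{d-2})$. All function names
are hypothetical.

```python
import cmath
from itertools import product


def det(matrix):
    """Determinant of a small complex matrix via Gaussian elimination with pivoting."""
    a = [row[:] for row in matrix]
    size = len(a)
    result = 1 + 0j
    for c in range(size):
        pivot = max(range(c, size), key=lambda r: abs(a[r][c]))
        if abs(a[pivot][c]) < 1e-12:
            return 0j
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return result


def hessian_dets(m: int, d: int, tol: float = 1e-9):
    """Hessian determinants of f at all singular points, for m = n - i >= 1."""
    roots = [cmath.exp(2j * cmath.pi * k / (d - 1)) for k in range(d - 1)]
    dets = []
    for xi in product(roots, repeat=m + 1):
        if abs(1 + sum(xi)) > tol:
            continue                      # not on the hyperplane, hence not singular
        last = xi[m] ** (d - 2)           # contribution of the eliminated variable
        hess = [[d * (d - 1) * ((xi[j] ** (d - 2) if j == l else 0) + last)
                 for l in range(m)] for j in range(m)]
        dets.append(det(hess))
    return dets


if __name__ == "__main__":
    for m, d in [(1, 4), (2, 3), (2, 5), (4, 3)]:
        values = hessian_dets(m, d)
        print(m, d, len(values), min(abs(v) for v in values))
```

In these examples every determinant has absolute value $(d(d-1))^m$, so each singular
point is a nondegenerate critical point of $f$, in agreement with Milnor number 1.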

4. The constants $\beta_{\mu,\nu}$

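Instead of working with the constants $\beta_{\mu,\nu}$, we define $\alpha_{\mu,\nu}$ to be
the number of complex solutions to the system of equations

(23)  $z_1^\nu = z_2^\nu = \cdots = z_\mu^\nu = 1, \qquad z_1 + z_2 + \cdots + z_\mu = 0.$

Then clearly $\beta_{\mu,\nu} = \frac{1}{\nu} \cdot \alpha_{\mu+1,\nu}$: dividing a solution
$(z_1, \ldots, z_{\mu+1})$ of (23) by $z_{\mu+1}$ gives a solution of the system defining
$\beta_{\mu,\nu}$, and each such solution arises from exactly $\nu$ solutions of (23). The
advantage of working with $\alpha_{\mu,\nu}$ is that their defining equations have better
symmetry.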

We would like to answer the following question. 

Question. Give a formula for $\alpha_{\mu,\nu}$ in terms of $\mu$ and the prime
factorization of $\nu$.

This is definitely a very hard question. The work of Lam and Leung gives a necessary
and sufficient condition for $\alpha_{\mu,\nu} \ne 0$.

Theorem 4.1. [LL] Suppose $\nu = p_1^{a_1} \cdots p_l^{a_l}$ is the prime factorization of
$\nu$. Then $\alpha_{\mu,\nu} \ne 0$ if and only if
$\mu \in \mathbb{Z}_{\ge 0} \cdot p_1 + \cdots + \mathbb{Z}_{\ge 0} \cdot p_l$.
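The criterion of Theorem 4.1 is easy to compare with a direct count for small
parameters. The Python sketch below is an illustration only, with hypothetical function
names: it counts $\alpha_{\mu,\nu}$ by brute force in floating point, which is assumed
reliable only for small $\mu$ and $\nu$, and checks whether $\mu$ is a nonnegative
integer combination of the prime divisors of $\nu$.

```python
import cmath
from itertools import product


def alpha(mu: int, nu: int, tol: float = 1e-9) -> int:
    """Count tuples (z_1, ..., z_mu) of nu-th roots of unity with z_1 + ... + z_mu = 0."""
    roots = [cmath.exp(2j * cmath.pi * k / nu) for k in range(nu)]
    return sum(1 for zs in product(roots, repeat=mu) if abs(sum(zs)) < tol)


def prime_factors(nu: int):
    """Distinct prime divisors of nu, by trial division."""
    primes, p = [], 2
    while p * p <= nu:
        if nu % p == 0:
            primes.append(p)
            while nu % p == 0:
                nu //= p
        p += 1
    if nu > 1:
        primes.append(nu)
    return primes


def is_nonneg_combination(mu: int, primes) -> bool:
    """Decide whether mu lies in Z_{>=0} p_1 + ... + Z_{>=0} p_l (simple dynamic program)."""
    reachable = [True] + [False] * mu
    for total in range(1, mu + 1):
        reachable[total] = any(total >= p and reachable[total - p] for p in primes)
    return reachable[mu]


if __name__ == "__main__":
    for nu in range(2, 7):
        for mu in range(1, 6):
            print(nu, mu, alpha(mu, nu), is_nonneg_combination(mu, prime_factors(nu)))
```

For instance, with $\nu = 6$ every $\mu \ge 2$ is of the form $2a + 3b$, and indeed only
$\alpha_{1,6}$ vanishes.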

When $\nu = p^r$ has only one prime factor, we can give a formula for $\alpha_{\mu,\nu}$. In
this case, suppose $(z_1, \ldots, z_\mu)$ is a solution to (23). Then the collection
$\{z_1, \ldots, z_\mu\}$ can be divided into groups of $p$ elements such that each group is
a rotation of $1, e^{2\pi i/p}, \ldots, e^{2(p-1)\pi i/p}$. Therefore, if $p$ does not divide
$\mu$, then $\alpha_{\mu,\nu} = 0$. If $p$ divides $\mu$, then

(24)  $\alpha_{\mu,\nu} = \sum \frac{\mu!}{((s_1)! \cdots (s_k)!)^p}$

where $k = \nu/p$, and the sum is over all $s_1, \ldots, s_k \in \mathbb{Z}_{\ge 0}$ such that
$s_1 + \cdots + s_k = \mu/p$. Since $\beta_{\mu,\nu} = \frac{1}{\nu} \cdot \alpha_{\mu+1,\nu}$,
we can translate (24) into a statement about $\beta_{\mu,\nu}$.

Proposition 4.2. Suppose $\nu = p^r$, where $p$ is a prime number and $r$ is a positive
integer. Then $\beta_{\mu,\nu} = 0$ when $p$ does not divide $\mu + 1$, and when $p$ divides
$\mu + 1$,

(25)  $\beta_{\mu,\nu} = \frac{1}{\nu} \sum \frac{(\mu+1)!}{((s_1)! \cdots (s_k)!)^p}$

where $k = \nu/p$, and the sum is over all $s_1, \ldots, s_k \in \mathbb{Z}_{\ge 0}$ such that
$s_1 + \cdots + s_k = (\mu + 1)/p$.
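In the prime power case the counts can be computed exactly. The sketch below is an
illustration with hypothetical function names. It assumes the multinomial form
$\alpha_{\mu,\nu} = \sum \mu! / ((s_1)! \cdots (s_k)!)^p$ with $k = \nu/p$ and
$s_1 + \cdots + s_k = \mu/p$, together with the relation
$\beta_{\mu,\nu} = \alpha_{\mu+1,\nu}/\nu$, and compares the result with a floating-point
brute-force count for small parameters.

```python
import cmath
from itertools import product
from math import factorial


def alpha_prime_power(mu: int, p: int, r: int) -> int:
    """Multinomial count of alpha_{mu,nu} for nu = p**r (p assumed prime)."""
    if mu % p != 0:
        return 0
    k, m = p ** (r - 1), mu // p
    total = 0
    # s_j is the number of times the j-th coset of the p-th roots of unity is used.
    for s in product(range(m + 1), repeat=k):
        if sum(s) != m:
            continue
        denom = 1
        for sj in s:
            denom *= factorial(sj) ** p
        total += factorial(mu) // denom
    return total


def beta_prime_power(mu: int, p: int, r: int) -> int:
    """beta_{mu,nu} for nu = p**r, via beta_{mu,nu} = alpha_{mu+1,nu} / nu."""
    return alpha_prime_power(mu + 1, p, r) // p**r


def alpha_brute(mu: int, nu: int, tol: float = 1e-9) -> int:
    """Brute-force count of alpha_{mu,nu}; only for small mu and nu."""
    roots = [cmath.exp(2j * cmath.pi * k / nu) for k in range(nu)]
    return sum(1 for zs in product(roots, repeat=mu) if abs(sum(zs)) < tol)


if __name__ == "__main__":
    for mu, p, r in [(2, 2, 1), (3, 3, 1), (4, 2, 2), (6, 3, 1)]:
        print(mu, p**r, alpha_prime_power(mu, p, r), alpha_brute(mu, p**r))
```

For example, for $\nu = 4$ and $\mu = 4$ the compositions $(2,0)$, $(1,1)$, $(0,2)$
contribute $6 + 24 + 6 = 36$, so $\beta_{3,4} = 36/4 = 9$.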

Suppose $\nu = p^r q^s$ has two distinct prime factors, and suppose $(z_1, \ldots, z_\mu)$
is a solution to (23). Then by [LL, Corollary 3.4], the collection $\{z_1, \ldots, z_\mu\}$
can be divided into groups of $p$ or $q$ elements such that each group is a rotation of
$1, e^{2\pi i/p}, \ldots, e^{2(p-1)\pi i/p}$ or a rotation of
$1, e^{2\pi i/q}, \ldots, e^{2(q-1)\pi i/q}$, respectively. However, this decomposition is not
unique, and this is the main difficulty in finding a formula for $\alpha_{\mu,\nu}$ in this
case. For instance, for $\nu = 6$ the six sixth roots of unity split either into three
antipodal pairs or into two rotated triples of cube roots of unity. This is already a
problem beyond our capability.

When $\nu$ has at least three distinct prime factors, the statement of [LL, Corollary
3.4] is no longer true. Therefore, the question becomes much harder and deeper.